Data ethics IRL: keeping algorithmic bias out of ML

Vivek Katial
10/02/2026
About Me
- Disclaimer: I am not an expert in ethics and philosophy
- Vivek Katial (vivek@gooddatainstitute.com)
- Co-founder and Executive Director @ Good Data Institute
- PhD @ Unimelb (Optimisation on Quantum
Computers)
- Visiting PhD Researcher @ NASA Jet Propulsion Laboratory
- I love traveling, trying new types of food, and meeting interesting people

Why do we need to consider ethics in Data and AI?

What is ethics?
- The word “Ethics” is derived from the Greek word
Ethos, meaning habit or custom.
- Ethics helps us distinguish between right and wrong.
- Philosophers have argued over many schools of thought for centuries.
- For this talk, we assume that whatever has social consensus today is our view on ethics.
Why do we need to consider ethics when building data products?
- Data and AI is among the fastest-growing industries, and that growth is expected to continue
- Undesirable consequences often occur, especially in relation to privacy and fairness
- Considering ethics gives us a framework to decide what is “OK” to do
Algorithmic Bias – what is it?
- Algorithmic bias refers to the tendency of algorithms to systematically and repeatedly produce outcomes that benefit one particular group over another
- There are already many examples in society where algorithms have harmed marginalized groups
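
One rough way to make the definition above concrete is to compare how often each group receives a favourable outcome. The sketch below is illustrative only: the function names, the group labels "A" and "B", and the toy data are all hypothetical, and the gap in favourable-outcome rates is just one of several possible fairness measures.

```python
from collections import defaultdict


def positive_rate_by_group(groups, outcomes):
    """Return the share of favourable outcomes (1) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, outcome in zip(groups, outcomes):
        totals[group] += 1
        positives[group] += outcome
    return {g: positives[g] / totals[g] for g in totals}


def parity_gap(groups, outcomes):
    """Difference between the highest and lowest group rates (0 means equal rates)."""
    rates = positive_rate_by_group(groups, outcomes)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = favourable decision, 0 = unfavourable decision.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]

rates = positive_rate_by_group(groups, outcomes)
gap = parity_gap(groups, outcomes)
print(rates, gap)
```

A large gap does not prove the algorithm is unfair on its own, but it is a signal that the outcomes deserve a closer look.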
Trivial Example

- Predictions on the image of the Western bride
included labels such as “bride”, “wedding”, “ceremony”
- For the woman wearing a traditional Indian wedding dress, the predicted labels were “costume”, “performing arts”, “event”
More Harmful Example

(Lum and Isaac 2016)
The Machine Learning Lifecycle

Data Collection
- We collect, label and prepare data for modelling
and analysis purposes.
- Issues arise when the data collected doesn’t fully
reflect the real world
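
A simple check for the second point above is to compare each group's share of the collected data with its share of the population the model is meant to serve. This is a minimal sketch under stated assumptions: the function name, the "urban"/"rural" labels, and the reference shares are hypothetical, and in practice the reference shares would come from a trusted external source such as a census.

```python
from collections import Counter


def representation_gaps(sample_groups, reference_shares):
    """Return sample share minus reference share for each group.

    Positive values mean the group is over-represented in the sample,
    negative values mean it is under-represented.
    """
    if not sample_groups:
        raise ValueError("sample_groups must not be empty")
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {
        group: counts.get(group, 0) / n - share
        for group, share in reference_shares.items()
    }


# Hypothetical example: the collected data is 75% urban,
# while the assumed real-world split is 50/50.
sample = ["urban"] * 6 + ["rural"] * 2
reference = {"urban": 0.5, "rural": 0.5}

gaps = representation_gaps(sample, reference)
print(gaps)
```

Groups with a strongly negative gap are candidates for additional data collection before modelling.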
How can bias arise?
